IRNLP@KAIST in Subtask of Research Papers Classification in NTCIR-8

Authors

  • Bashar Al-Shboul
  • Sung-Hyon Myaeng
Abstract

In this paper, we present a novel query expansion approach that splits the user query into a set of N-grams and expands each N-gram separately using a set of research articles. The approach retrieves a set of relevant research articles and processes their abstracts to expand the query, that is, the searched term or phrase. We aim to expand terms that regular relevance feedback might ignore. Our work shows an improvement at several classification levels compared with several other expansion methods.
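The idea of splitting a query into N-grams and expanding each one from article abstracts can be illustrated with a minimal sketch. The abstract does not specify how articles are retrieved or how expansion terms are scored, so everything below is an assumption made for illustration: "retrieval" is reduced to checking whether an abstract contains the N-gram, candidate terms are ranked by raw co-occurrence count, and the names `expand_query`, `expand_ngram`, and `STOPWORDS` are hypothetical rather than taken from the paper.

```python
import re
from collections import Counter

# Tiny illustrative stopword list (an assumption, not from the paper).
STOPWORDS = {"a", "an", "the", "of", "for", "in", "on", "and", "to", "with", "is", "are"}


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def ngrams(tokens, max_n=2):
    """Return all contiguous N-grams of length 1..max_n as tuples."""
    grams = []
    for n in range(1, max_n + 1):
        for i in range(len(tokens) - n + 1):
            grams.append(tuple(tokens[i:i + n]))
    return grams


def contains(tokens, gram):
    """True if the token sequence contains the N-gram contiguously."""
    n = len(gram)
    return any(tuple(tokens[i:i + n]) == gram for i in range(len(tokens) - n + 1))


def expand_ngram(gram, abstracts, top_k=3):
    """Pick the terms that co-occur most often with one N-gram.

    Assumption: an abstract counts as relevant if it contains the N-gram;
    a real system would use a proper retrieval model instead.
    """
    counts = Counter()
    for abstract in abstracts:
        tokens = tokenize(abstract)
        if contains(tokens, gram):
            counts.update(t for t in tokens if t not in STOPWORDS and t not in gram)
    # Sort by descending count, then alphabetically, for deterministic output.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [term for term, _ in ranked[:top_k]]


def expand_query(query, abstracts, max_n=2, top_k=3):
    """Expand every N-gram of the query separately."""
    tokens = tokenize(query)
    expansions = {}
    for gram in ngrams(tokens, max_n):
        if all(t in STOPWORDS for t in gram):
            continue  # skip N-grams made only of stopwords
        expansions[gram] = expand_ngram(gram, abstracts, top_k)
    return expansions


abstracts = [
    "Query expansion improves patent retrieval.",
    "Patent retrieval with query expansion and relevance feedback.",
    "Image segmentation with neural networks.",
]
for gram, terms in expand_query("query expansion", abstracts, top_k=2).items():
    print(" ".join(gram), "->", terms)
```

Because each N-gram is expanded on its own, a phrase such as "query expansion" can pick up different terms than the single word "query" would, which is one way to reach terms that a single whole-query feedback pass might miss.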

Similar resources

Overview of the Patent Mining Task at the NTCIR-8 Workshop

This paper introduces the Patent Mining Task at the Eighth NTCIR Workshop and the test collections produced in this task. The purpose of the Patent Mining Task is to create technical trend maps from a set of research papers and patents. We performed two subtasks: (1) the subtask of research papers classification and (2) the subtask of technical trend map creation. For the subtask of research pa...


Hiroshima City University at NTCIR-8 Patent Mining Task

Our group participated in the subtask of technical trend map creation for the NTCIR-8 Patent Mining Task. We prepared five types of cue phrase list using statistical methods, and used them in the analysis of research papers and patents based on the Support Vector Machines. From the experimental results, we obtained Recall of 0.110 and Precision of 0.424 for research papers, and Recall of 0.430 ...


Multi-label Classification using Logistic Regression Models for NTCIR-7 Patent Mining Task

We design a multi-label classification system based on a machine learning approach for the NTCIR-7 Patent Mining Task. In our system, we employ a logistic regression model for each International Patent Classification (IPC) code that determines the IPC code assignment of research papers. The logistic regression models are trained by using patent documents provided by task organizers. To mitigate...


Overview of Classification Subtask at NTCIR-6 Patent Retrieval Task

This paper describes the Classification Subtask of the NTCIR-6 Patent Retrieval Task. The purpose of this subtask is to evaluate the methods of classifying patents into multi-dimensional classification structures called F-term (File Forming Term) classification systems. We report on how this subtask was designed, the test collection released, and the results of the evaluation.


Experiments for NTCIR-8 Technical Trend Map Creation Subtask at Hitachi

This paper reports on an experiment to evaluate the extraction of effect expressions from patents and papers (in Japanese) at the subtask of Technical Trend Map Creation in NTCIR-8 Patent Mining Task. To obtain a more detailed structure for the expressions, we defined that effect expressions consist of TARGET, SCALE and IMPACT elements. We created training data based on these elements and assig...


Publication date: 2010